Demystifying MapReduce

Author

  • Christopher Garcia
Abstract

Recent innovations in Big Data have enabled major strides in our ability to glean important insights from massive amounts of data and to use these insights to make better decisions. Underlying many of these innovations is a computational paradigm known as MapReduce, which enables computational processes to be scaled up to very large sizes and to take advantage of cloud computing. While very powerful, MapReduce also requires a nontrivial shift in algorithm design strategies. In this paper we provide an overview of MapReduce and the types of problems it is suited for. We discuss general strategies for designing MapReduce-based algorithms and provide an illustration using social media analytics.
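
As a rough illustration of the shift in algorithm design that the abstract mentions, the sketch below expresses a hashtag count as separate map, shuffle, and reduce steps in plain Python. It is not taken from the paper: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample posts are hypothetical, and everything runs in a single process, whereas a real framework such as Hadoop would distribute each phase across machines.

```python
from collections import defaultdict


def map_phase(post):
    """Map step: emit a (hashtag, 1) pair for every hashtag in one post."""
    for word in post.split():
        if word.startswith("#") and len(word) > 1:
            yield word.lower(), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(posts):
    """Run the three phases in sequence (single process, illustration only)."""
    mapped = (pair for post in posts for pair in map_phase(post))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample data, not from the paper.
    sample_posts = ["#BigData is big", "love #bigdata and #Hadoop"]
    print(run_job(sample_posts))
```

The point of the split is that `map_phase` looks at one post at a time and `reduce_phase` looks at one key at a time, so neither needs shared state; that independence is what lets a framework run many copies of each in parallel.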

Similar resources

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

The Hadoop MapReduce framework is an important distributed processing model for large-scale, data-intensive applications. Current Hadoop and the rack-aware data placement strategy of the existing Hadoop distributed file system, as used by MapReduce in a homogeneous Hadoop cluster, assume that each node in the cluster has the same computing capacity and is assigned the same workload. Default Hadoop d...

Demystifying EPR: A Rookie Guide to the Application of Electron Paramagnetic Resonance Spectroscopy on Biomolecules

Electron Paramagnetic Resonance (EPR) spectroscopy, also known as Electron Spin Resonance (ESR) especially among physicists, is a strong and versatile spectroscopic method for investigation of paramagnetic systems, i.e. systems like free radicals and most transition metal ions, which have unpaired electrons. The sensitivity and selectivity of EPR are notable and intriguing as compared to other spec...

Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations, because the rapid development in the use of information technology in general, and network technology in particular, has led many organizations to make their applications available for use via electronic platforms hos...

Demystifying the digital divide.

Policy leaders and social scientists have grown increasingly concerned about a societal split between those with and those without access to computers and the Internet. The U.S. National Telecommunications and Information Administration popularized a term for this situation in the mid-1990s: the “digital divide.” The phrase soon became used in an international context as well, to describe the s...

Publication date: 2013